35 research outputs found
Recommended from our members
NASICs: A 'Fabric-Centric' Approach Towards Integrated Nanosystems
This dissertation addresses the fundamental problem of how to build computing systems for the nanoscale. With CMOS reaching fundamental limits, emerging nanomaterials such as semiconductor nanowires, carbon nanotubes, and graphene have been proposed as promising alternatives. However, nanoelectronics research has largely focused on a 'device-first' mindset without adequately addressing system-level capabilities, challenges for integration and scalable assembly.
In this dissertation, we propose to develop an integrated nano-fabric (broadly defined as nanostructures/devices in conjunction with paradigms for assembly, interconnection and circuit styles), as opposed to approaches that focus on MOSFET replacement devices as the ultimate goal. In the 'fabric-centric' mindset, design choices at individual levels are made compatible with the fabric as a whole and minimize challenges for nanomanufacturing while achieving system-level benefits vs. scaled CMOS.
We present semiconductor nanowire based nano-fabrics incorporating these fabric-centric principles, called NASICs and N3ASICs, and discuss how we have taken them from initial design to experimental prototype. Manufacturing challenges are mitigated through careful design choices at multiple levels of abstraction. Regular fabrics with limited customization mitigate overlay alignment requirements. Cross-nanowire FET devices and interconnect are assembled together as part of the uniform regular fabric without the need for arbitrary fine-grain interconnection at the nanoscale, routing or device sizing. Unconventional circuit styles are devised that are compatible with regular fabric layouts and eliminate the requirement for using complementary devices.
Core fabric concepts are introduced and validated. Detailed analyses on device-circuit co-design and optimization, cascading, noise and parameter variation are presented. Benchmarking of nanowire processor designs vs. equivalent scaled 16nm CMOS shows up to 22X area and 30X power benefits at comparable performance, with overlay precision that is achievable with present-day technology. Building on the extensive manufacturing-friendly fabric framework, we present recent experimental efforts and key milestones that have been attained towards realizing a proof-of-concept prototype at dimensions of 30nm and below.
Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators
Analog in-memory computing (AIMC), a promising approach for energy-efficient acceleration of deep learning workloads, computes matrix-vector multiplications (MVMs) but only approximately, due to nonidealities that often are non-deterministic or nonlinear. This can adversely impact the achievable deep neural network (DNN) inference accuracy as compared to a conventional floating point (FP) implementation. While retraining has previously been suggested to improve robustness, prior work has explored only a few DNN topologies, using disparate and overly simplified AIMC hardware models. Here, we use hardware-aware (HWA) training to systematically examine the accuracy of AIMC for multiple common artificial intelligence (AI) workloads across multiple DNN topologies, and investigate sensitivity and robustness to a broad set of nonidealities. By introducing a new and highly realistic AIMC crossbar-model, we improve significantly on earlier retraining approaches. We show that many large-scale DNNs of various topologies, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers, can in fact be successfully retrained to show iso-accuracy on AIMC. Our results further suggest that AIMC nonidealities that add noise to the inputs or outputs, not the weights, have the largest impact on DNN accuracy, and that RNNs are particularly robust to all nonidealities.
Comment: 35 pages, 7 figures, 5 tables
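As a rough illustration of what "approximate MVM" means in the abstract above, the following minimal sketch perturbs a matrix-vector product at the three points the abstract distinguishes: inputs, weights, and outputs. It is not the crossbar model from the paper. The function name `noisy_mvm`, the additive Gaussian noise, and the parameter names are all assumptions made for illustration.

```python
import random

def noisy_mvm(weights, x, in_noise=0.0, w_noise=0.0, out_noise=0.0, rng=None):
    """Approximate y = W x with additive Gaussian noise (hypothetical model).

    in_noise, w_noise, out_noise are standard deviations applied to the
    inputs, the weights, and the outputs respectively.
    """
    rng = rng or random.Random(0)
    # Perturb the input vector before it reaches the array.
    x_n = [xi + rng.gauss(0.0, in_noise) for xi in x]
    y = []
    for row in weights:
        # Perturb every weight, then accumulate the dot product.
        acc = sum((w + rng.gauss(0.0, w_noise)) * xi for w, xi in zip(row, x_n))
        # Perturb the accumulated output.
        y.append(acc + rng.gauss(0.0, out_noise))
    return y

W = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
exact = noisy_mvm(W, x)                  # all noise levels zero
noisy = noisy_mvm(W, x, out_noise=0.1)   # output noise only
```

In hardware-aware training, a noisy forward pass of this general kind would stand in for the exact product during training so that the network learns weights that tolerate the perturbations. The noise levels here are placeholders, not values from the paper.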
Neuromorphic computing using non-volatile memory
Dense crossbar arrays of non-volatile memory (NVM) devices represent one possible path for implementing massively-parallel and highly energy-efficient neuromorphic computing systems. We first review recent advances in the application of NVM devices to three computing paradigms: spiking neural networks (SNNs), deep neural networks (DNNs), and "Memcomputing". In SNNs, NVM synaptic connections are updated by a local learning rule such as spike-timing-dependent-plasticity, a computational approach directly inspired by biology. For DNNs, NVM arrays can represent matrices of synaptic weights, implementing the matrix-vector multiplication needed for algorithms such as backpropagation in an analog yet massively-parallel fashion. This approach could provide significant improvements in power and speed compared to GPU-based DNN training, for applications of commercial significance. We then survey recent research in which different types of NVM devices (including phase change memory, conductive-bridging RAM, filamentary and non-filamentary RRAM, and other NVMs) have been proposed, either as a synapse or as a neuron, for use within a neuromorphic computing application. The relevant virtues and limitations of these devices are assessed, in terms of properties such as conductance dynamic range, (non)linearity and (a)symmetry of conductance response, retention, endurance, required switching power, and device variability.
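The idea of an NVM array representing a weight matrix can be sketched as follows. This is an idealized model and not taken from the review: it assumes each signed weight is stored as a pair of non-negative conductances, that inputs are applied as voltages, and that each output is the difference of two summed currents (Ohm's law per device, Kirchhoff's current law per line). The function names and the `g_max` parameter are hypothetical.

```python
def weights_to_conductances(weights, g_max=1.0):
    """Map signed weights to (G+, G-) conductance pairs in [0, g_max]."""
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = g_max / w_max
    g_pos = [[max(w, 0.0) * scale for w in row] for row in weights]
    g_neg = [[max(-w, 0.0) * scale for w in row] for row in weights]
    return g_pos, g_neg, scale

def crossbar_mvm(g_pos, g_neg, voltages, scale):
    """Ideal crossbar read: sum currents per line, then subtract the pair."""
    out = []
    for row_p, row_n in zip(g_pos, g_neg):
        i_pos = sum(g * v for g, v in zip(row_p, voltages))  # I = sum(G * V)
        i_neg = sum(g * v for g, v in zip(row_n, voltages))
        out.append((i_pos - i_neg) / scale)  # rescale back to weight units
    return out

W = [[1.0, -2.0], [0.5, 0.0]]
g_pos, g_neg, scale = weights_to_conductances(W)
y = crossbar_mvm(g_pos, g_neg, [1.0, 2.0], scale)
```

Real devices depart from this ideal picture through the properties the abstract lists, such as limited conductance dynamic range, nonlinear and asymmetric conductance response, and device variability.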
Regular 2D NASIC-based Architecture and Design Space Exploration
As CMOS technology approaches its physical limits, several emerging technologies are being investigated to find the right replacement for future computing systems. A number of different fabrics and architectures are currently under investigation. Unfortunately, at this time, no unified modeling exists to offer sound support for algorithmic design space exploration with no compromise on device feasibility. This work presents a NASIC-compliant application-specific computing architecture template, along with its performance models and optimization policies that support domain-space exploration. This architecture has up to 29X density advantage over CMOS, is completely compatible with the NASIC manufacturing pathway, and enables the creation of unique max-rate pipelined systems.